I. INTRODUCTION


Linked data structures form an integral part of many software and database systems. Performing
error detection and correction to preserve the correctness of data structures is important in
achieving overall system reliability. To reduce the performance degradation incurred through their
use, detection and correction should ideally be executed concurrently with normal processing, and
every invocation of these procedures should be completed in O(1) time. If any global checking
information (e.g., a global count) is used in detection or correction, then O(n) nodes must be
accessed, where n is the number of nodes in the structure, and those procedures cannot run in O(1)
time. In addition, since node access time is the major contributing factor to the cost of error
detection, the number of nodes accessed should be minimized. The Checking Window concept is
introduced in this paper as a method of formalizing these ideas, and as a method of describing local
concurrent error detectability as a function of the number of nodes to be checked. To preserve the
structural integrity of linked data structures, a new approach to detecting and correcting structural
errors, called the virtual backpointer, is also introduced in this paper. The technique is used to
construct two new data structures: the Virtual Double-Linked List and the B-Tree with Virtual
Backpointers. The Virtual Double-Linked List uses the same amount of storage as the double-linked
list from which it is derived. The B-Tree with Virtual Backpointers, derived from the B-tree of
order m, requires m+4 more fields in each node. It is shown that O(1) local concurrent error
detection can be performed for both structures, and that O(1) correction is possible for those errors
detected during forward moves through the structures. Correction for those errors detected during
backward moves through the structures is, in the worst case, O(n).

The foundation work concerning robust data structures was performed by Taylor, Morgan,
and Black [1]. Several techniques have since been presented to achieve robust data structures;
however, most achieve error detection in O(n) time. A global count, as used by Taylor, Morgan and
Black in the modified(k) double-linked list, the chained and threaded binary tree, and the robust
B-tree [1-3], by Munro and Poblete in their isomorphic binary tree [4], by Sampaio and Sauve in
their robust binary tree [5], and by Seth and Muralidhar in their mod(2) chained and threaded
binary tree [6], necessitates, for some errors, a traversal of all the nodes of the structure for error
detection. The three pointer tree, as explained by Yoshihara et al. [7], requires O(n) time to detect
double errors, since a preorder traversal of all the nodes of the tree is performed. Though not
indicated in their paper, error detection can be performed in O(1) time using the D-loops within the
structure, but only single errors can be detected. Kuspert's work with the separately-chained hash
table [8], which is an application of double-linked lists, achieves detection in O(1) time; however,
five extra fields must be stored in each node.

A general theory of local detectability and local correctability has been introduced and
formalized by Black and Taylor [9], and has been successfully applied to several different types of
data structures, including the spiral(k) list [9], the LB-tree [9-10], the mod(k) list [11], the
helix(k) list [12], and the AVL tree [13]. The intention of their work is to be able to correct an
arbitrary number of errors in a data structure, provided the errors are sufficiently separated from
each other. However, the complexities of the correction algorithms (which include error detection)
are typically not O(1).

The organization of this paper is as follows. Section II presents an analysis of local concurrent
error detection, giving formal definitions for Checking Windows and local concurrent error
detectability. In Section III, the virtual backpointer concept is described and is used to construct two
new data structures: the Virtual Double-Linked List and the B-Tree with Virtual Backpointers. The
local concurrent error detectability and correctability of each structure is analyzed. Section IV
describes a concurrent auditor process as applied to data structure error detection, analyzes its
effectiveness in increasing the mean time to failure of a system, and presents the results of an
implementation. Finally, Section V summarizes the results.

II. LOCAL CONCURRENT ERROR DETECTION AND CORRECTION


Local concurrent error detection (LCED) is an on-line technique for detecting structural errors
in a locality of a currently accessed node in a linked data structure. If the size of the locality is
constant and the degree of each node is fixed, then an LCED procedure will run in O(1) time. Local
concurrent error correction (LCEC) can correct errors detected by an LCED procedure, using
another locality of the currently accessed node (not necessarily the same as that used by the LCED
procedure). If the size of the locality is again constant, then an LCEC procedure will run in O(1)
time. Error detection and correction typically degrade system performance. The degradation is a
function of the number of nodes accessed, the number of nodes stored, and the computation
required for detection and correction. For the LCED procedures analyzed here, no extra node
accesses are required (except in the initialization phase). Hence, the storage and computation
requirements dominate the cost of error detection and correction.

Linked data structures may be modeled as directed graphs. A graph $G = (N, E)$ consists of a
finite set of nodes $N = \{N_1, N_2, \ldots, N_n\}$ and a finite set of edges $E = \{E_1, E_2, \ldots, E_m\}$.
Each edge $E_i = \langle N_j, N_k \rangle$ links a pair of ordered nodes in this directed graph (digraph). In
the digraph representation of a linked data structure, the nodes represent the data records, and the
edges represent the pointers between the records. If all the nodes consist of the same fields, then the
data structure is said to be uniform. A move from a node $N_j$ to a node $N_k$ is possible if there exists
an edge $E_i$ between them, and is represented as $N_j \to N_k$. Then $N_k$ is reached from $N_j$ by
following $E_i$. A traversal is a series of moves starting at a root node or header of a structure that
accesses part or all of the data structure.

An LCED procedure is invoked to detect structural errors whenever a move attempts to follow
a pointer, which may be a forward pointer, a backward pointer, or a virtual backpointer
(Section III). That is, the LCED procedure attempts to verify the move. Thus, it is on-line, or
concurrent with normal structure access.

The errors considered in this paper are those that affect the structural information of the data
structure (e.g., pointer values, structural checking information). The probability of an erroneous
pointer to a random location remaining undetected by the techniques presented in this paper is
proportional to a negative power of 2 that depends on $b$ and $d$, where $b$ is the number of bits used
to represent a pointer, and $d$ is the number of erroneous pointers required for masking. Since this
probability is very low, the error detection analysis concentrates on the case where erroneous
pointers point to other nodes of the same type. This kind of error may occur in partially or
incorrectly updated data structures, or as a result of software errors or hardware failures. These
erroneous pointers may or may not coincide with logical pointer boundaries; however, the routine
that accesses nodes from slow memory can detect these boundary errors and supply this information
to the LCED procedure.

Memory subsystems are commonly configured hierarchically, and the ratio of the access time
of slower memory (used to store the data structure, e.g., MOS RAM, disk) to that of faster
memory (used to buffer the currently accessed nodes, e.g., cache, register file) is usually very large.
Hence it is desirable to have all the nodes in the LCED or LCEC localities stored in the fastest
memory. In the remainder of this paper, $A_i$ will represent the address of a node $N_i$ in a linked data
structure. $N_i$ may have many pointers to other nodes, and a desired move MV from $N_i$ will be
represented as $N_i \to N_{MV}$.

Definition 1: $R_c$ is a fast memory of capacity $c$ nodes, which holds the $c$ most recently
accessed nodes, including the node reached by the current move MV. Since a move is performed
between two nodes, $c$ must be at least two to verify the move. That is, for a move MV, $N_i \to N_{MV}$,
$R_c$ holds both $N_i$ and $N_{MV}$. If $c = 1$ then only $N_{MV}$ could be stored, and the information of the
source node $N_i$ (e.g., address, pointer value) would be lost. Thus, an erroneous move would be
indistinguishable from a correct move. □

The LCED procedure requires a set of $c$ nodes to verify the move MV. This set of nodes is
called a Checking Window. The cost of a Checking Window is proportional to $c$, since it involves
storing the required nodes in the fast memory (storage cost) and performing checks on those nodes
(computation cost). The nodes in the Checking Window need not be re-accessed from slow
memory, since they are already stored in $R_c$.

Definition 2: Let a set of Checking Windows of size $c$, $\mathcal{W}^c$, be defined recursively as
$\mathcal{W}^c = \{W_j^{c-1} \cup \{N_k\}\}$, where $W_j^{c-1}$ is the $j$th Checking Window of $\mathcal{W}^{c-1}$
($1 \le j \le |\mathcal{W}^{c-1}|$) and $N_k \notin W_j^{c-1}$ is adjacent to one of the nodes in $W_j^{c-1}$. The
base case is $\mathcal{W}^2 = \{\{N_i, N_{MV}\}\}$. □

$W_m^c$, for some $m$, is constructed by adding one more node $N_k$ to the smaller Checking Window
$W_j^{c-1}$, such that $N_k$ can be reached from $W_j^{c-1}$ in one move. All such $W_m^c$ form a set of sets,
$\mathcal{W}^c$. It will be shown that Checking Windows of the same size do not necessarily achieve the
same detectability. When the context is clear, we may use $W^c$ to represent one particular $W_j^c$.

Example 1: Consider a forward move $N_i \to N_{i+1}$ in a normal double-linked list (Figure 1):

$$W_1^2 = \{N_i, N_{i+1}\}$$

$$\mathcal{W}^2 = \{W_1^2\} = \{\{N_i, N_{i+1}\}\}$$

$$W_1^3 = \{N_i, N_{i+1}, N_{i+2}\}$$

$$W_2^3 = \{N_{i-1}, N_i, N_{i+1}\}$$

$$\mathcal{W}^3 = \{W_1^3, W_2^3\} = \{\{N_i, N_{i+1}, N_{i+2}\}, \{N_{i-1}, N_i, N_{i+1}\}\}$$

Figure 1. Checking Windows for a Double-Linked List.
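
The recursive construction of Definition 2 can be illustrated with a short Python sketch. It is not part of the original analysis: the function names (`grow_windows`, `checking_windows`, `dll_adjacency`) and the representation of nodes as integers are assumptions made for illustration, and adjacency is taken to mean "reachable in one move by any pointer".

```python
from typing import Dict, FrozenSet, Set

Adjacency = Dict[int, Set[int]]  # hypothetical: node index -> nodes reachable in one move


def grow_windows(windows: Set[FrozenSet[int]], adjacency: Adjacency) -> Set[FrozenSet[int]]:
    """Build the windows of size c from the windows of size c-1 (Definition 2)."""
    grown = set()
    for window in windows:
        for node in window:
            for neighbour in adjacency[node]:
                if neighbour not in window:          # N_k must not already be in the window
                    grown.add(window | {neighbour})  # W_j^{c-1} united with {N_k}
    return grown


def checking_windows(source: int, target: int, c: int, adjacency: Adjacency) -> Set[FrozenSet[int]]:
    """Return the set of all Checking Windows of size c for the move source -> target."""
    windows = {frozenset({source, target})}          # base case: the single window of size 2
    for _ in range(c - 2):
        windows = grow_windows(windows, adjacency)
    return windows


def dll_adjacency(n: int) -> Adjacency:
    """Adjacency of a non-circular double-linked list with nodes 0 .. n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}
```

For a forward move from node 2 to node 3 in a six-node list, `checking_windows(2, 3, 3, dll_adjacency(6))` yields the two windows {2, 3, 4} and {1, 2, 3}, which correspond to $W_1^3$ and $W_2^3$ of Example 1.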


The Lock and Key concept is now introduced as a generalization of structural checking
information that is distributed throughout the nodes of linked data structures (distributed checks).
In the simplest case, nodes in the structure will have associated with them a Key. When performing a
move from a node to its child, the node's Key becomes an argument to the child's Lock function,
which either returns "True," signaling a valid move, or "False," signaling error detection. In its most
general form, the Lock and Key concept allows for multiple-Key Locks and Keys distributed over
potentially many nodes.


Definition 3: A Key is information associated with a node (e.g., its address, a pointer, or
distributed check) that is used by a checking function to verify a move.

Definition 4: A Lock, $\mathrm{Lock}_{MV}$, is a checking function that verifies a move, such that
$\mathrm{Lock}_{MV}(\mathrm{Key}_1, \ldots, \mathrm{Key}_k)$ = "True" if all its $\mathrm{Key}_i$ arguments are present and correct,
"False" if all its $\mathrm{Key}_i$ arguments are present and not all are correct, or "X" (don't care) if not all
its $\mathrm{Key}_i$ are present. A Lock whose Key arguments are all present is called a checkable Lock;
otherwise the Lock is an uncheckable Lock.


The computational overhead to evaluate the checkable Locks is O(1) if all $\mathrm{Lock}_{MV}$ are defined
on Keys that can be contained in a fixed-size Checking Window $W_j^c$. No storage overhead is
necessary because Locks are functions and are not stored, and Keys can be information that is already
present in the node, e.g., pointers.


Definition 5: A Circular Lock, $\mathrm{CLock}_{N_i \dashrightarrow N_k}$, is a Lock function whose Keys are
addresses of nodes:

$$\mathrm{Keys} = \langle A_i, A_k \rangle$$

$$\mathrm{CLock}_{N_i \dashrightarrow N_k}(x, y) = (x \stackrel{?}{=} g(y))$$

where $\dashrightarrow$ is a pointer (e.g., a forward pointer, a backward pointer, a virtual backpointer) of
$N_i$ to $N_k$, $g$ is a function that generates $x$ using a series of pointers, and $\stackrel{?}{=}$ represents
a comparison that returns either "True" or "False" for a checkable CLock.


Circular Locks possess the property that for all starting nodes $N_i$, any single pointer error
encountered in the moves of $g$ causes the Lock to evaluate to "False." The following two examples
show that the double-linked list and a binary tree with signatured access paths employ Locks and
Keys. The double-linked list uses a Circular Lock checking function, while the tree with signatured
access paths uses a Lock defined on O(height-of-tree) Keys.

Example 2: Let $N_0, N_1, \ldots, N_n$ be the nodes of a double-linked list. Let a node $N_i$ have a
forward pointer $P_i$ and a backpointer $B_i$. For a forward move $N_i \to N_{i+1}$:

$$\mathrm{Keys} = \langle A_i, A_{i+1} \rangle$$

$$\mathrm{CLock}_{N_i \to N_{i+1}}(x, y) = (x \stackrel{?}{=} g(y)) = (x \stackrel{?}{=} y.B).$$

The backpointers are the distributed checks, and the $g$ function in the Circular Lock retrieves the
backpointer $B$ from the node at $y$. This structure achieves O(1) single pointer error detection in
Checking Window $W_1^2$ (cf. Example 1). □
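
As a minimal sketch of the Circular Lock of Example 2, the Python fragment below verifies a forward move by comparing $A_i$ with the backpointer stored at the target. The `Node` class, the dictionary-based "memory", the circular layout, and the function names are illustrative assumptions, not part of the original paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    P: int  # forward pointer: address of the next node
    B: int  # backpointer: address of the previous node


def build_dll(addresses: List[int]) -> Dict[int, Node]:
    """Build a circular double-linked list over the given addresses (assumed layout)."""
    n = len(addresses)
    return {
        a: Node(P=addresses[(i + 1) % n], B=addresses[(i - 1) % n])
        for i, a in enumerate(addresses)
    }


def forward_lock(memory: Dict[int, Node], x: int, y: int) -> bool:
    """CLock(x, y) = (x ?= y.B): g retrieves the backpointer from the node at y."""
    return x == memory[y].B


def checked_forward_move(memory: Dict[int, Node], a_i: int) -> int:
    """Follow P_i and verify the move; raise if the Lock evaluates to False."""
    target = memory[a_i].P
    if target not in memory or not forward_lock(memory, a_i, target):
        raise ValueError(f"structural error detected on the move from {a_i}")
    return target
```

Only the source node and the target node are touched, so the check stays within a Checking Window of size two.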

Example 3: In the signatured access path technique, signatures defined over the nodes of valid
traversal paths are embedded at path termination points, where a traversal path starts at a header
and ends at a leaf, for a binary tree [14]. Error detection is achieved by comparing signatures
generated at traversal time with the embedded signatures. A simple signature is the logical
exclusive-or function ($\oplus$) of all the pointers in the valid traversal path.

$$\mathrm{Keys} = \langle \text{ordered set of pointers in a valid traversal path, signature} \rangle$$

$$\mathrm{Lock}_{forward}(p_1, \ldots, p_k, \mathrm{signature}) = (p_1 \oplus \cdots \oplus p_k \oplus \mathrm{signature} \stackrel{?}{=} 0).$$

The nodes' pointers are the distributed checks. This structure cannot guarantee O(1) detection time
as O(height-of-tree) nodes may be accessed in the traversal path. □
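
The exclusive-or signature Lock of Example 3 can be sketched as follows. This is an illustration under stated assumptions: pointers are modeled as plain integers, and the helper names `path_signature` and `signature_lock` are hypothetical.

```python
from functools import reduce
from operator import xor
from typing import Iterable


def path_signature(pointers: Iterable[int]) -> int:
    """Exclusive-or of all pointers followed on a header-to-leaf traversal path."""
    return reduce(xor, pointers, 0)


def signature_lock(pointers: Iterable[int], embedded_signature: int) -> bool:
    """Lock_forward: (p_1 xor ... xor p_k xor signature) ?= 0."""
    return (path_signature(pointers) ^ embedded_signature) == 0
```

The Lock needs every pointer on the path as a Key, which is why its cost grows with the height of the tree rather than staying constant.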

We now determine the minimum number of errors that are required to cause the checkable
Locks used by the LCED procedure to evaluate to "True" in a particular Checking Window. This is
similar to the changes used by Taylor, Morgan and Black [15] to determine the distance between
two data structure instances. The difference here is that the distance is measured within a Checking
Window. Hence this new distance is termed local distance, from which the definition of local
concurrent error detectability follows directly. Let $\mathrm{Lock}_{MV}$ be defined, for every possible move
MV in a specific data structure, over Keys distributed in nodes contained in a fixed-size Checking
Window.

Definition 6: The local distance, $d_j^c(MV)$, within a Checking Window of size $c$ is defined as
the minimum number of pointer errors in $W_j^c$ that can mask a move to an incorrect node, due to
a pointer error, where MV is the move to the correct node. Errors are not detectable if all checkable
$\mathrm{Lock}_{MV}$ evaluate to "True." □

Definition 7: The local concurrent error detectability, $D^c(MV)$, for a specified move MV and
Checking Window of size $c$ is given by:

$$D^c(MV) = \max_j\bigl(d_j^c(MV)\bigr) - 1, \quad 1 \le j \le |\mathcal{W}^c|.$$ □

The max function is used because, for a specified move, it is always possible to find a Checking
Window $W_j^c$ which can detect at least $D^c$ simultaneous errors (including the pointer from $N_i$ to
$N_{MV}$ that is erroneous). When the context is clear, we may omit the parameter MV in $d_j^c(MV)$ or
$D^c(MV)$.
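
Definition 7 reduces to a one-line computation once the local distances are known. The helper below is a hypothetical illustration; the name `local_detectability` and the list representation of the distances $d_j^c$ are assumptions.

```python
from typing import Sequence


def local_detectability(distances: Sequence[int]) -> int:
    """D^c = max_j(d_j^c) - 1, taken over all Checking Windows of size c."""
    if not distances:
        raise ValueError("at least one Checking Window is required")
    return max(distances) - 1
```

With the distances derived later for the Virtual Double-Linked List (Theorem 3), a single window of size two with $d_1^2 = 2$ gives `local_detectability([2]) == 1`, while two windows of size three with distances 2 and 3 give `local_detectability([2, 3]) == 2`: the better of the two windows determines $D^3$.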

The  following  theorem  will  be  used  to  prove  that  the  local  concurrent  error  detectability  of 
data  structures  employing  the  virtual  backpointer  is  the  same  for  both  forward  and  backward 
moves. 

Theorem 1: In a uniform data structure, if for every pointer of the form $N_i \to N_k$ there exists
a $\dashrightarrow$ pointer to reach $N_i$ from $N_k$ in one move, and the Lock functions are Circular Locks, then
using an LCED procedure, $D^c(N_i \to N_k) = D^c(N_k \dashrightarrow N_i) = D^c$.

Proof: Since the data structure is uniform, $N_i \to N_k$ and $N_k \dashrightarrow N_i$ represent all possible
forward and backward moves, respectively. Notice that $W_1^2 = \{N_i, N_k\}$. Thus, all $W_j^c$ are also
the same for both moves, as $W_j^c$ is defined on $W_1^2$. If $N_i \to N_k$ is erroneously changed to
$N_i \to N_{k'}$, it is isomorphic to the case of $N_k \dashrightarrow N_i$ being changed to $N_k \dashrightarrow N_{i'}$, because
the pointers used in the $g$ function of the Circular Lock are not changed by the isomorphism. In both
cases, the Locks evaluate to the same value because the accessible nodes in $W_j^c$ are the same. By
Definition 6, $d_j^c(N_i \to N_k) = d_j^c(N_k \dashrightarrow N_i)$. Hence
$D^c(N_i \to N_k) = D^c(N_k \dashrightarrow N_i) = D^c$. □

Theorem 2 will be used in determining the upper bounds of local concurrent error detectability
for the Virtual Double-Linked List and B-Tree with Virtual Backpointers.

Theorem 2: Local concurrent error detectability is a monotonically increasing function of
window size $c$. That is, $D^{c-1} \le D^c \le D^n$ for $3 \le c \le n$, where $n$ is the total number of nodes
in the data structure.

Proof: Every $W_m^c$ is constructed by adding one adjacent node $N_k$ to a Checking Window of
size $c-1$: $W_m^c = W_j^{c-1} \cup \{N_k\}$. If each checkable Lock in $W_j^{c-1}$ evaluates to "True" in
$W_j^{c-1}$ then it will remain "True" in $W_m^c$ because the Keys of the Lock are contained in both
$W_j^{c-1}$ and $W_m^c$. If the addition of $N_k$ causes an uncheckable Lock in $W_j^{c-1}$ to evaluate to "True"
or "X" in $W_m^c$, this results in $d_m^c = d_j^{c-1}$. However, if the uncheckable Lock evaluates to
"False," then $d_m^c > d_j^{c-1}$, since at least one other error would be required to mask the detected
error. Hence, $d_m^c \ge d_j^{c-1}$. Then $\max(d_m^c) \ge \max(d_j^{c-1})$, and $D^c \ge D^{c-1}$ follows from
Definition 7. The upper limit of detectability is trivially $D^n$, since the entire structure is then
included in the Checking Window. □

If the Checking Window includes all the nodes of the structure, the LCED procedure degenerates
into a global error detection procedure, which requires O(n) execution time. Therefore, to achieve
maximum local concurrent error detectability, it is sufficient to use a $W^c$ with minimum size $c$ for
which $D^c = D^n$.

The LCED procedures mentioned throughout this section were unspecified because the actual
procedure used depends on the particular data structure to be checked. The general LCED
technique is as follows. First, determine the appropriate Checking Window $W_j^c$ that achieves the
desired local concurrent error detectability. For each possible move from each node, identify the
Lock functions and associated Key arguments that are used to perform the checking. The LCED
procedure can be constructed as follows: for each move made, access the nodes defined by the
Checking Window, and evaluate all the checkable Lock functions. If all Locks return "True," then
either no error has occurred or undetectable errors have occurred; if any Lock returns "False," then
at least one error has been detected. Once an error has been detected by an LCED procedure, LCEC
may be performed. The upper limit of correctability is determined by the local concurrent error
detectability $D^c$; however, the actual correctability depends upon the data structure.
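
The general technique above can be sketched as a small driver that evaluates every Lock against the nodes currently held in the Checking Window. This is a sketch under assumptions rather than the procedure of any particular structure: the `Window` and `Lock` types, the use of `None` for the "X" value, and the helper `equals_field` are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional

Window = Dict[int, dict]                   # address -> fields of a node held in fast memory
Lock = Callable[[Window], Optional[bool]]  # True, False, or None standing for "X"


def lced(window: Window, locks: Iterable[Lock]) -> bool:
    """Return False as soon as a checkable Lock fails, True otherwise."""
    for lock in locks:
        if lock(window) is False:
            return False   # at least one error has been detected
    return True            # no error, or only undetectable errors


def equals_field(key_address: int, holder: int, field: str) -> Lock:
    """Build a Lock that compares key_address with window[holder][field]."""
    def lock(window: Window) -> Optional[bool]:
        node = window.get(holder)
        if node is None:
            return None    # a Key is missing from the window: uncheckable Lock
        return node[field] == key_address
    return lock
```

Uncheckable Locks are simply skipped, which mirrors the "X" case of Definition 4: enlarging the window can only turn such a Lock into a checkable one.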


Since  errors  are  detected  and  corrected  based  only  on  information  from  nodes  in  the  Checking 
Window,  many  other  detectable  errors  may  exist  simultaneously  throughout  the  data  structure. 
Although  the  local  concurrent  error  detectability  and  correctability  may  only  be  one  or  two  in  the 
window,  the  actual  number  of  detectable  and  correctable  errors  may  be  much  larger. 


III.  VIRTUAL  BACKPOINTERS 

The virtual backpointer is a distributed checking symbol that can be used to achieve O(1)
LCED and O(1) LCEC during a forward move, and O(1) LCED and O(n) LCEC during a backward
move in many linked data structures. In addition, it can be used to generate a backpointer from a
node $N_i$ to its parent. In the general case, a virtual backpointer may point to an ancestor
$N_{ancestor}$ of a node $N_i$, where $N_{ancestor}$ is an ancestor of $N_i$ if there exists a series of moves from
$N_{ancestor}$ to $N_i$.

Definition 8: In a linked data structure, let $N_{ancestor}$ be an ancestor of $N_i$, and $Q_i$ be the set of
all pointers in $N_i$. The virtual backpointer is $V_i = f(Q_i, A_{ancestor})$, where $f$ is a function such that
$A_{ancestor} = f'(Q_i, V_i) = f'(Q_i, f(Q_i, A_{ancestor}))$, and $f'$ is a companion function determined by $f$.
In general, there may be vectors of virtual backpointers, $\vec{V}_i = \vec{f}(Q_i, \vec{A})$, which, after suitable
transformation by $f'$, point to vectors of nodes $\vec{A}$. □


The virtual backpointer has the following properties. 1) For a forward move $N_i \to N_{i+1}$, $V_{i+1}$
provides checking information. 2) For a backward move $N_{i+1} \dashrightarrow N_i$, $V_{i+1}$ provides the
backpointer after transformation by $f'$, and $Q_{ancestor}$ is used as checking information. Two example
data structures employing the virtual backpointer are presented in the following subsections: the
Virtual Double-Linked List, which is derived from the double-linked list, and the B-Tree with
Virtual Backpointers, which is derived from the B-tree.


A.  Virtual  Double-Linked  List 


The Virtual Double-Linked List (VDLL) is a data structure that employs the virtual backpointer
and possesses local concurrent error detectability and correctability. Errors are detected in
O(1) time with an LCED procedure. For a forward move, detected errors may be corrected using
LCEC in O(1) time; for a backward move, detected errors may be corrected using LCEC in O(n)
time. The VDLL requires no more storage space than the double-linked list (DLL), and retains the
simplicity of the DLL, in that it is possible to move directly from a node to its parent, using the
virtual backpointer. This is not possible, for example, in the modified(k) DLL [1], for $k \ge 2$, which
must access other ancestors of a node in order to reach the node's parent.


Definition 9: A Virtual Double-Linked List is described as follows (Figure 2). In a linked
list data structure, let $N_{i-1}$ be the parent of $N_i$, and $P_i$ be the forward pointer of $N_i$; therefore
$Q_i = \{P_i\}$. Let $f(\{x\}, y) = f'(\{x\}, y) = x \oplus y$; then $V_i = P_i \oplus A_{i-1} = A_{i+1} \oplus A_{i-1}$, and
$A_{i-1} = P_i \oplus V_i$, where $\oplus$ denotes the logical exclusive-or function. Also, $c$ header nodes
$N_0, N_{-1}, \ldots, N_{-c+1}$ are added, where $c$ is the size of the Checking Window. These header nodes are
assumed to be always accessible by the LCED procedure. Note that $N_{n+1} = N_{-c+1}$.
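
The exclusive-or relation of Definition 9 can be checked with a few lines of Python. The sketch below is illustrative only: the `VNode` class, the dictionary keyed by address, and the circular layout without separate header nodes are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VNode:
    P: int  # forward pointer: address of the next node
    V: int  # virtual backpointer: P xor address of the parent


def build_vdll(addresses: List[int]) -> Dict[int, VNode]:
    """Build a circular VDLL in which V_i = A_{i+1} xor A_{i-1}."""
    n = len(addresses)
    memory = {}
    for i, a in enumerate(addresses):
        nxt = addresses[(i + 1) % n]
        prev = addresses[(i - 1) % n]
        memory[a] = VNode(P=nxt, V=nxt ^ prev)
    return memory


def parent_address(node: VNode) -> int:
    """Recover the backpointer: A_{i-1} = P_i xor V_i."""
    return node.P ^ node.V
```

Each node stores two fields, as in an ordinary double-linked list, yet the parent address is available only through the forward pointer, so a damaged $P_i$ also disturbs the recovered backpointer.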


The VDLL is created from the DLL by replacing its backpointers with virtual backpointers.
The same operation can be applied to the modified(k) DLL family [1], resulting in the modified(k)
VDLL structures. It will be shown that each modified(k) VDLL achieves greater local concurrent
error detectability than the corresponding modified(k) DLL.

Figure 2. Virtual Double-Linked List (VDLL) of 5 nodes.

Definition 10: A modified(k) Virtual Double-Linked List is described as follows. In a linked
list data structure, let $N_{i-k}$ be the $k$th ancestor of $N_i$, and $P_i$ be the forward pointer of $N_i$;
therefore $Q_i = \{P_i\}$. Let $f(\{x\}, y) = f'(\{x\}, y) = x \oplus y$; then
$V_i = P_i \oplus A_{i-k} = A_{i+1} \oplus A_{i-k}$, and $A_{i-k} = P_i \oplus V_i$.
Also, $\max(k+1, c)$ header nodes are added. □

The possible Locks and Keys of the VDLL can be identified as follows (Figure 2). For a
forward move $N_i \to N_{i+1}$ following $P_i$,

$$\mathrm{Keys} = \langle A_i, P_{i+1} \oplus V_{i+1} \rangle$$

$$\mathrm{CLock}_{N_i \to N_{i+1}}(x, y) = (x \stackrel{?}{=} g(y)) = (x \stackrel{?}{=} y),$$

where $g$ is the identity function. For the backward move $N_{i+1} \dashrightarrow N_i$ following $V_{i+1} \oplus P_{i+1}$,

$$\mathrm{Keys} = \langle A_{i+1}, A_i \rangle$$

$$\mathrm{CLock}_{N_{i+1} \dashrightarrow N_i}(x, y) = (x \stackrel{?}{=} g(y)) = (x \stackrel{?}{=} y.P),$$

where $g$ retrieves the pointer $P$ from the node at $y$. Locks and Keys for the modified(k) VDLL can
be identified similarly. Using the results of the analysis of LCED, we now determine the local
concurrent error detectability of the VDLL.
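
The two Circular Locks of the VDLL can be sketched as follows. The dictionary layout, the circular list without header nodes, and the function names are assumptions made for this illustration; an address that is not present in `memory` stands in for an error reported by the node access routine.

```python
from typing import Dict, List

Memory = Dict[int, Dict[str, int]]  # address -> {"P": forward pointer, "V": virtual backpointer}


def make_vdll(addresses: List[int]) -> Memory:
    """Circular VDLL stored as {address: {"P": next, "V": next xor prev}}."""
    n = len(addresses)
    return {
        a: {"P": addresses[(i + 1) % n],
            "V": addresses[(i + 1) % n] ^ addresses[(i - 1) % n]}
        for i, a in enumerate(addresses)
    }


def forward_lock(memory: Memory, a_i: int) -> bool:
    """Forward move along P_i: A_i ?= P_{i+1} xor V_{i+1} (g is the identity)."""
    a_target = memory[a_i]["P"]
    if a_target not in memory:
        return False                    # rejected by the node access routine
    target = memory[a_target]
    return a_i == (target["P"] ^ target["V"])


def backward_lock(memory: Memory, a_next: int) -> bool:
    """Backward move from N_{i+1} along V xor P: A_{i+1} ?= (node at y).P."""
    node = memory[a_next]
    y = node["P"] ^ node["V"]           # address generated by the virtual backpointer
    if y not in memory:
        return False                    # rejected by the node access routine
    return a_next == memory[y]["P"]
```

Both checks touch only the source and target nodes of the move, so each evaluates in constant time.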

Theorem 3: Using an LCED procedure, the local concurrent error detectability of the VDLL
is $D^2(\text{forward}) = D^2(\text{backward}) = D^2 = 1$, and
$D^c(\text{forward}) = D^c(\text{backward}) = D^c = D^3 = 2$, $\forall c \ge 3$.

Proof: Since the VDLL uses virtual backpointers and Circular Locks, by Theorem 1,
$D^c(\text{forward}) = D^c(\text{backward})$. Consider a forward move MV, $N_i \to N_{i+1}$, following $P_i$. The
LCED procedure attempts to verify this move. A pointer that does not point to a logical node
boundary can easily be detected by the node access routine. Therefore consider only erroneous
pointers that lead to valid logical node addresses. Suppose that $P_i$ is erroneous and leads to $N_{j+1}$
instead of $N_{i+1}$. In $W_1^2 = \{N_i, N_{j+1}\}$, $d_1^2 = 2$: either $V_{j+1}$ or $P_{j+1}$ must be erroneous to mask
the error in $P_i$. Assume that $V_{j+1}$ is erroneous (Figure 3a). In $W_1^3 = \{N_i, N_{j+1}, N_{j+2}\}$,
$d_1^3 = 2$. However, in $W_2^3 = \{N_{i-1}, N_i, N_{j+1}\}$, $V_i$ will lead to the detection of the error in
$P_i$, because following the backpointer given by $V_i \oplus P_i$ will lead to a node $N_{k-1}$ instead of
$N_{i-1}$, and $P_{k-1} \ne A_i$. Therefore, $V_i$ must be changed into the value $A_{j+1} \oplus A_{i-1}$ to mask the
error in $P_i$. Thus $d_2^3 = 3$.

Assume now that $V_{j+1}$ is not erroneous, so $P_{j+1}$ must be erroneous (Figure 3b). Consider
$W_1^3 = \{N_i, N_{j+1}, N_{k+2}\}$. The LCED procedure will not detect the error in $P_i$ if $P_{j+1}$ has been
changed to $A_{k+2} = A_i \oplus V_{j+1}$, and $V_{k+2} \oplus P_{k+2}$ has been changed (via a change in either
$V_{k+2}$ or $P_{k+2}$) to $A_{j+1}$. The remainder of the analysis is similar to the case above, and gives
$d_1^2 = 2$, $d_1^3 = 3$, and $d_2^3 = 3$. According to Definition 7, $D^2 = 1$ and $D^3 = 2$. Since the VDLL
can be changed to another correct VDLL by three pointer errors (node deletion), $D^n = 2$, where $n$ is
the number of nodes in the structure. By Theorem 2, $D^c = 2$, $\forall c \ge 3$. □

The above proof suggests that when moving forward $N_i \to N_{MV}$ following $P_i$, use
$W^3 = \{N_{prev}, N_i, N_{MV}\}$ as the Checking Window, where $N_{prev}$ corresponds to $N_{i-1}$ in the proof; and
when moving backward $N_i \dashrightarrow N_{MV}$ following $P_i \oplus V_i$, use $W^3 = \{N_i, N_{MV}, N_{next}\}$ as the
Checking Window, where $N_{next}$ is the node reached by following $P_{MV} \oplus V_{MV}$. By using these
windows, double pointer errors can be detected, or single pointer errors corrected (described below).
The LCED procedure using this Checking Window evaluates four locks when moving either forward or
backward. For a forward move, the locks are: L1: $A_{prev} \stackrel{?}{=} P_i \oplus V_i$,
L2: $A_i \stackrel{?}{=} P_{MV} \oplus V_{MV}$, L3: $A_i \stackrel{?}{=} P_{prev}$, and L4: $A_{MV} \stackrel{?}{=} P_i$. For a
backward move, the locks are: L1: $A_{next} \stackrel{?}{=} P_{MV} \oplus V_{MV}$,
L2: $A_{MV} \stackrel{?}{=} P_i \oplus V_i$, L3: $A_{MV} \stackrel{?}{=} P_{next}$, and L4: $A_i \stackrel{?}{=} P_{MV}$. (In the $W^2$
Checking Window, only two locks are evaluated, namely $A_i \stackrel{?}{=} P_{MV} \oplus V_{MV}$ and
$A_{MV} \stackrel{?}{=} P_i$ for the forward move, and $A_{MV} \stackrel{?}{=} P_i \oplus V_i$ and $A_i \stackrel{?}{=} P_{MV}$ for the
backward move.) A comparison of local concurrent error detectability is given in Table 1 for the
VDLL, modified(2) VDLL, modified(3) VDLL, DLL without a global count, and modified(2) and
modified(3) DLL without global counts [1], for various sized Checking Windows. The local
detectability of the modified(2) and modified(3) VDLL can be obtained using the same analysis
technique as that applied to VDLL. Any modified(k) VDLL achieves greater local concurrent error
detectability than the corresponding modified(k) DLL. For k > 3, no further improvement in
detectability can be made for either of the two families.

Table 1. Local Concurrent Error Detectability of Several Linked List Data Structures.
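
To make the role of the larger window concrete, the sketch below evaluates the four forward-move locks L1 to L4 on a small VDLL and then injects the double error discussed in the proof of Theorem 3 (an erroneous $P_i$ masked by a matching change of $V$ in the node reached). The data layout and function names are illustrative assumptions, and header nodes are omitted.

```python
from typing import Dict, List

Memory = Dict[int, Dict[str, int]]  # address -> {"P": forward pointer, "V": virtual backpointer}


def make_vdll(addresses: List[int]) -> Memory:
    """Circular VDLL stored as {address: {"P": next, "V": next xor prev}}."""
    n = len(addresses)
    return {
        a: {"P": addresses[(i + 1) % n],
            "V": addresses[(i + 1) % n] ^ addresses[(i - 1) % n]}
        for i, a in enumerate(addresses)
    }


def forward_locks_w3(memory: Memory, a_prev: int, a_i: int, a_mv: int) -> Dict[str, bool]:
    """Evaluate L1..L4 for the forward move N_i -> N_MV in the window {N_prev, N_i, N_MV}."""
    prev, cur, mv = memory[a_prev], memory[a_i], memory[a_mv]
    return {
        "L1": a_prev == (cur["P"] ^ cur["V"]),  # A_prev ?= P_i xor V_i
        "L2": a_i == (mv["P"] ^ mv["V"]),       # A_i ?= P_MV xor V_MV
        "L3": a_i == prev["P"],                 # A_i ?= P_prev
        "L4": a_mv == cur["P"],                 # A_MV ?= P_i
    }
```

In an error-free list all four locks hold. If $P_i$ is redirected to another node and that node's $V$ is altered so that L2 still holds, the two locks of the $W^2$ window (L2 and L4) both pass, but L1 fails because $P_i \oplus V_i$ no longer yields $A_{prev}$; this is the double error that only the window of size three detects.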

Theorem 4: Any single pointer error detected by a forward move in $W^3 = \{N_{prev}, N_i, N_{MV}\}$ in
a VDLL can be corrected with an O(1) LCEC procedure requiring at most one extra node access for
both diagnosis and correction. Any single pointer error detected by a backward move in
$W^3 = \{N_{next}, N_{MV}, N_i\}$ in a VDLL can be corrected with an O(n) LCEC procedure requiring at most
one extra node access for diagnosis.

Proof: Since the local concurrent error detectability for this structure using $W_3$ is $D_3 = 2$, the upper limit of correctability is 1. Assume that a single error has been detected during a forward move. The LCED procedure supplies the values of the four detection locks (Table 2a), and three error indication values generated by a node access routine, $NA_{prev}$, $NA_i$, $NA_{MV}$, that indicate out-of-bounds pointers or pointers that do not point to logical node boundaries, when used to access $N_{prev}$, $N_i$ and $N_{MV}$, respectively. There are eight possible errors: 1) $A_{prev}$ error, 2) $P_{prev}$ error, 3) $A_i$ error, 4) $P_i$ error, 5) $V_i$ error, 6) $A_{MV}$ error, 7) $P_{MV}$ error and 8) $V_{MV}$ error. To distinguish the eight errors, the seven-tuple syndrome {L1, L2, L3, L4, $NA_{prev}$, $NA_i$, $NA_{MV}$} is constructed (Table 2b). For the error-free case, the syndrome will be {True, True, True, True, True, True, True}. There are two cases of identical syndromes for different errors. In each case one extra node is accessed to completely diagnose the error. $N_x$ is accessed by following $P_{MV}$ to distinguish a $P_{MV}$ error from a $V_{MV}$ error. $N_y$ is accessed by following $P_i \oplus V_i$ to distinguish an $A_{prev}$ error from a $V_i$ error. Once the error has been diagnosed, correction proceeds as follows:

1) $A_{prev}$ error: correct value is $P_i \oplus V_i$.

2) $P_{prev}$ error: correct value is $A_i$.

3) $A_i$ error: correct value is $P_{prev}$.

4) $P_i$ error: correct value is $A_{MV}$.


Table 2a. Detection and Diagnosis Locks for Forward Moves in the VDLL using $W_3$.

Detection Locks
  L1   $A_{prev} \stackrel{?}{=} P_i \oplus V_i$
  L2   $A_i \stackrel{?}{=} P_{MV} \oplus V_{MV}$
  L3   $A_i \stackrel{?}{=} P_{prev}$
  L4   $A_{MV} \stackrel{?}{=} P_i$

Diagnosis Locks
  L5   $A_{MV} \stackrel{?}{=} P_x \oplus V_x$     Access $N_x$ via $P_{MV}$
  L6   $A_i \stackrel{?}{=} P_y$                   Access $N_y$ via $P_i \oplus V_i$

Table 2b. Error Detection and Diagnosis Syndromes for Errors Detected by Forward Moves in the VDLL using $W_3$.

error        L1  L2  L3  L4  $NA_{prev}$  $NA_i$  $NA_{MV}$  L5  L6
$A_{prev}$   F   T   T   T   T            T       T          -   T
$P_{prev}$   T   T   F   T   T            T       T          -   -
$A_i$        T   F   F   T   T            T       T          -   -
$P_i$        F   T   T   T   T            T       F          -   -
$P_i$        F   F   T   T   T            T       T          -   -
$P_i$        F   F   T   T   T            T       F          -   -
$V_i$        F   T   T   T   T            T       T          -   F
$A_{MV}$     T   T   T   F   T            T       T          -   -
$P_{MV}$     T   F   T   T   T            T       T          F   -
$V_{MV}$     T   F   T   T   T            T       T          T   -

5) $V_i$ error: correct value is $A_{prev} \oplus P_i$.

6) $A_{MV}$ error: correct value is $P_i$.

7) $P_{MV}$ error: correct value is $A_i \oplus V_{MV}$.

8) $V_{MV}$ error: correct value is $A_i \oplus P_{MV}$.
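The diagnosis step for a forward move amounts to a table lookup on the syndrome, followed by one extra lock when two errors share a syndrome. The sketch below encodes the syndromes of Table 2b as a dictionary. It is an illustration under stated assumptions: the names `SYNDROMES` and `diagnose` are hypothetical, and the three syndromes with L1 false and either L2 or $NA_{MV}$ false are taken to belong to the $P_i$ error.

```python
T, F = True, False

# (L1, L2, L3, L4, NA_prev, NA_i, NA_MV) -> candidate single errors
SYNDROMES = {
    (F, T, T, T, T, T, T): ("A_prev", "V_i"),   # resolved by L6
    (T, T, F, T, T, T, T): ("P_prev",),
    (T, F, F, T, T, T, T): ("A_i",),
    (F, T, T, T, T, T, F): ("P_i",),
    (F, F, T, T, T, T, T): ("P_i",),
    (F, F, T, T, T, T, F): ("P_i",),
    (T, T, T, F, T, T, T): ("A_MV",),
    (T, F, T, T, T, T, T): ("P_MV", "V_MV"),    # resolved by L5
}


def diagnose(syndrome, l5=None, l6=None):
    """Return the name of the erroneous field for a forward-move syndrome.

    l5 and l6 are the diagnosis locks; each is needed only when two
    errors share the same seven-tuple syndrome.
    """
    candidates = SYNDROMES[tuple(syndrome)]
    if len(candidates) == 1:
        return candidates[0]
    if candidates == ("A_prev", "V_i"):
        if l6 is None:
            raise ValueError("L6 is required: access N_y via P_i xor V_i")
        return "A_prev" if l6 else "V_i"
    if l5 is None:
        raise ValueError("L5 is required: access N_x via P_MV")
    return "V_MV" if l5 else "P_MV"
```

A syndrome that is not in the dictionary raises `KeyError`, which in this sketch stands for an error pattern outside the single-error assumption.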

Table 3a. Detection and Diagnosis Locks for Backward Moves in the VDLL using $W_3$.

Table 3b. Error Detection and Diagnosis Syndromes for Errors Detected by Backward Moves in the VDLL using $W_3$.

Assume now that a single error has been detected during a backward move. The LCED procedure supplies the values of the four detection locks (Table 3a), and three error indication values generated by a node access routine, $NA_{next}$, $NA_{MV}$, $NA_i$, that indicate out-of-bounds pointers or pointers that do not point to logical node boundaries, when used to access $N_{next}$, $N_{MV}$ and $N_i$, respectively. There are eight possible errors: 1) $A_{next}$ error, 2) $P_{next}$ error, 3) $A_{MV}$ error, 4) $P_{MV}$ error, 5) $V_{MV}$ error, 6) $A_i$ error, 7) $P_i$ error and 8) $V_i$ error. To distinguish the eight errors, the seven-tuple syndrome {L1, L2, L3, L4, $NA_{next}$, $NA_{MV}$, $NA_i$} is constructed (Table 3b). For the error-free case, the syndrome will be {True, True, True, True, True, True, True}. There are two cases of identical syndromes for different errors. In each case one extra node is accessed to completely diagnose the error. $N_x$ is accessed by following $P_{next}$ to distinguish a $P_{next}$ error from a $V_{MV}$ error. $N_y$ is accessed by following $P_i$ to distinguish a $P_i$ error from a $V_i$ error. Once the error has been diagnosed, correction proceeds as follows:


1) $A_{next}$ error: correct value is $P_{MV} \oplus V_{MV}$.

2) $P_{next}$ error: correct value is $A_{MV}$.

3) $A_{MV}$ error: correct value is $P_{next}$.

4) $P_{MV}$ error: correct value is $A_i$.

5) $V_{MV}$ error: To correct the error in $V_{MV}$, first access the headers of the structure. Next, move forward, accessing nodes $N_0, N_1, \ldots, N_k$, performing $W_3$ LCED and correcting single errors with O(1) LCEC, until $P_k = A_{MV}$. Then the correct value of $V_{MV}$ is $A_k \oplus P_{MV}$.

6) $A_i$ error: correct value is $P_{MV}$.

7) $P_i$ error: correct value is $A_{MV} \oplus V_i$.

8) $V_i$ error: correct value is $A_{MV} \oplus P_i$. □

Note that for a forward move, both diagnosis and correction are O(1) time, and require one extra node access. For a backward move, diagnosis is O(1) time (one extra node access) but correction requires O(n) extra node accesses in the worst case. Thus, O(1) LCEC is possible for an error detected by a forward move, while O(n) LCEC is possible for an error detected by a backward move. The proof assumed that $W_3$ LCED was used; if $W_2$ is used instead, then diagnosis for both the forward and backward moves is still O(1), but correction for both moves requires O(n) LCEC.
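The O(n) cost of the backward-move case comes from the repair of $V_{MV}$, which needs the address of the predecessor of $N_{MV}$ and can only find it by walking forward from the header. The sketch below shows that walk under simplifying assumptions: the list is circular, memory is a dictionary mapping each simulated address to a `[P, V]` pair, and the $W_3$ LCED check that the procedure performs at every step is omitted. The names `make_vdll` and `repair_v_mv` are hypothetical.

```python
def make_vdll(addrs):
    """Return {address: [P, V]} for a circular VDLL (circularity is assumed)."""
    n = len(addrs)
    return {
        a: [addrs[(k + 1) % n], addrs[(k - 1) % n] ^ addrs[(k + 1) % n]]
        for k, a in enumerate(addrs)
    }


def repair_v_mv(mem, header, a_mv):
    """Recompute V_MV by walking forward from the header until P_k == A_MV.

    Returns A_k, the address of the predecessor that was found.
    At most n nodes are accessed, where n is the number of nodes.
    """
    a_k = header
    for _ in range(len(mem)):
        p_k = mem[a_k][0]
        if p_k == a_mv:
            mem[a_mv][1] = a_k ^ mem[a_mv][0]   # V_MV = A_k xor P_MV
            return a_k
        a_k = p_k
    raise LookupError("no node points to A_MV")


mem = make_vdll([0x10, 0x20, 0x30, 0x40])
good_v = mem[0x30][1]
mem[0x30][1] = 0xFF                              # corrupt V_MV
pred = repair_v_mv(mem, header=0x10, a_mv=0x30)
```

After the call, the virtual backpointer of the node at `0x30` holds its original value again.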


B.  B-Tree  with  Virtual  Backpointers 

The B-Tree with Virtual Backpointers (VBT) of order m is a data structure that possesses local concurrent error detectability and correctability. Errors are detected in O(1) time if the time complexity is measured as a function of the number of nodes in the tree, i.e., n. For a forward move, detected errors can be corrected using O(1) LCEC; for a backward move, detected errors can be corrected using $O(\log_m n)$ LCEC. The VBT requires m+4 extra fields in each node, and has the additional feature that backward traversal can be performed without a stack, using the virtual backpointer.

The underlying structure of the VBT is the B-tree of order m [16], which finds application in the construction and maintenance of large-scale search trees. The B-tree has the following characteristics:

1) Every node contains at most 2m keys, and every node except the root contains at least m keys. The root contains at least one key.

2) Every node is either a leaf node, with no pointers to other nodes, or an internal node, with pointers to other internal nodes or to leaf nodes.

3) All leaf nodes appear at the same level.

4) An internal node with k keys will have k+1 pointers to subtrees. The k keys will be arranged in strictly increasing order, and keys in the ith subtree will be less than the ith key, while keys in the (i+1)th subtree will be greater than the ith key.

Let $P_{i,j}$ be the jth pointer in node $N_i$. Assume that each pointer requires one word of memory. Therefore, each pointer is uniquely addressable by $A_{i,j}$ (Figure 4a). The VBT is modified from the B-tree in the following ways to achieve local concurrent error detectability.

1) A header node $N_0$ is created with $P_{0,j} = A_1$ for $0 \leq j \leq 2m$.

2) $V_i$, the virtual backpointer of $N_i$, is defined as $V_i = P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m} \oplus A_{parent,j}$, where the jth pointer in $N_{parent}$ points to $N_i$. For the special case of the virtual backpointer from the root to the header, $V_1$ is defined on $A_{0,0}$, even though all $P_{0,j}$ point to $N_1$.

3) The keys of $N_i$ (i.e., $K_{i,1}, K_{i,2}, \ldots, K_{i,2m}$) are arranged in a matrix (Figure 4b) and the key check symbols $X_{i,j}$ and $Y_{i,j}$ are generated using a product code [17] as follows:

$X_{i,j} = K_{i,(j-1)m+1} \oplus K_{i,(j-1)m+2} \oplus \cdots \oplus K_{i,(j-1)m+m}$, $1 \leq j \leq 2$

$Y_{i,j} = K_{i,j} \oplus K_{i,m+j}$, $1 \leq j \leq m$.

$K_{i,j}$ is used to determine $X_{i,\mathrm{int}((j-1)/m)+1}$ and $Y_{i,(j-1) \bmod m + 1}$, called its corresponding X and Y check symbols, respectively.
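The product code over the keys can be sketched as follows. The example treats the 2m keys of one node as a 2 x m matrix stored row by row, computes the two X symbols (one per row) and the m Y symbols (one per column), and repairs a single corrupted key by intersecting the failing row and column. It is a sketch under the assumption that the check symbols themselves are intact; indices are 0-based here, whereas the text uses 1-based indices, and the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def check_symbols(keys, m):
    """Return (X, Y) for 2m keys laid out as a 2 x m matrix, row by row."""
    assert len(keys) == 2 * m
    X = [reduce(xor, keys[r * m:(r + 1) * m]) for r in range(2)]
    Y = [keys[c] ^ keys[m + c] for c in range(m)]
    return X, Y


def correct_single_key(keys, X, Y, m):
    """Locate and repair one corrupted key in place.

    Returns the index of the repaired key, or None when the pattern of
    failing checks is not that of a single key error.
    """
    X2, Y2 = check_symbols(keys, m)
    bad_rows = [r for r in range(2) if X2[r] != X[r]]
    bad_cols = [c for c in range(m) if Y2[c] != Y[c]]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        r, c = bad_rows[0], bad_cols[0]
        keys[r * m + c] ^= X2[r] ^ X[r]   # XOR out the error pattern
        return r * m + c
    return None


keys = [5, 9, 12, 20]
X, Y = check_symbols(keys, m=2)
keys[2] = 13                               # inject a single key error
fixed_index = correct_single_key(keys, X, Y, m=2)
```

The key at 0-based index 2 corresponds to $K_{i,3}$ in the notation of the text, whose check symbols for m = 2 are $X_{i,2}$ and $Y_{i,1}$.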

The number of key fields used in $N_i$ is called $count_i$, which is added for performance enhancement. A VBT of order 2 is illustrated in Figure 4c. The possible Locks and Keys of the VBT can be identified as follows. Assuming the jth pointer of $N_i$ points to $N_k$, for a forward move $N_i \rightarrow N_k$ following $P_{i,j}$,

Keys = $\langle A_{i,j},\ (P_{k,0} \oplus P_{k,1} \oplus \cdots \oplus P_{k,2m} \oplus V_k) \rangle$

$CLock_{N_i \rightarrow N_k}(x, y) = (x \stackrel{?}{=} g(y)) = (x \stackrel{?}{=} y)$,

where g is the identity function. For the backward move $N_k \rightarrow N_i$ following $(P_{k,0} \oplus P_{k,1} \oplus \cdots \oplus P_{k,2m} \oplus V_k)$,

Keys = $\langle A_k,\ A_{i,j} \rangle$

$CLock_{N_k \rightarrow N_i}(x, y) = (x \stackrel{?}{=} g(y)) = (x \stackrel{?}{=} y.P_j)$,

where g retrieves the jth pointer $P_{i,j}$ from the node at y.
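The role of the virtual backpointer in the two locks can be made concrete with a short sketch. It assumes a hypothetical memory layout in which pointer j of a node sits at word offset j from the node address, and represents null pointers as 0; neither choice is taken from the paper. The names `ptr_addr`, `virtual_backpointer` and `backward_target` are illustrative.

```python
from functools import reduce
from operator import xor

M = 2  # order of the tree (assumed); a node holds 2*M + 1 pointer words


def ptr_addr(node_addr, j):
    """A_{i,j}: address of pointer j, assuming it sits at word offset j."""
    return node_addr + j


def virtual_backpointer(pointers, parent_ptr_addr):
    """V = P_0 xor P_1 xor ... xor P_2m xor A_{parent,j}."""
    return reduce(xor, pointers, parent_ptr_addr)


def backward_target(pointers, v):
    """Recover A_{parent,j} by XOR-ing all pointers of the node with V."""
    return reduce(xor, pointers, v)


parent, child, j = 0x100, 0x200, 3
child_ptrs = [0x300, 0x310, 0, 0, 0]       # 2*M + 1 pointers, 0 = null
v_child = virtual_backpointer(child_ptrs, ptr_addr(parent, j))

# Forward lock: A_{i,j} ?= P_{k,0} xor ... xor P_{k,2m} xor V_k
forward_ok = ptr_addr(parent, j) == backward_target(child_ptrs, v_child)

# A single pointer error in the child makes the forward lock fail.
bad_ptrs = list(child_ptrs)
bad_ptrs[1] ^= 0x08
forward_bad = ptr_addr(parent, j) == backward_target(bad_ptrs, v_child)
```

The same XOR that checks the forward move also yields the address to follow on a backward move, which is why no stack is needed for upward traversal.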

We now determine the local concurrent error detectability of the VBT, employing the results of the analysis of LCED. Using Theorem 2, Table 4 presents the possible key and pointer errors that can occur in the VBT (errors in the count field are covered by the fifth and sixth rows of the table), and the number of errors required to mask them, assuming an LCED procedure is used.

Theorem 5: Using an LCED procedure, the local concurrent error detectability of the VBT is $D_2 = 1$ and $D_3 = D_c = 2$, $\forall c \geq 3$.

Proof: From Table 4, the minimum $d_2 = 2$ and the minimum $d_c = 3$, $\forall c \geq 3$. From Definition 7, it follows that $D_2 = 1$ and $D_c = 2$, $\forall c \geq 3$. □

Figure 4a. Node Representation in Order-2 B-Tree with Virtual Backpointers.

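From Table 4 it can be seen that no increase in the local concurrent error detectability can be gained by using $W_c$ for c > 3. It can be shown that when moving forward $N_i \rightarrow N_{MV}$ following $P_{i,j}$, or when moving backward $N_i \rightarrow N_{MV}$ following $(P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m} \oplus V_i)$, use $W_3 = \{N_{prev}, N_i, N_{MV}\}$ and $W_3 = \{N_i, N_{MV}, N_{next}\}$ respectively, to achieve detection of double pointer errors, or correction of single pointer errors (described below). In the window for the forward move, $N_{prev}$ is the parent of $N_i$, and in that for the backward move, $N_{next}$ is the parent of $N_{MV}$. The LCED procedure using this window evaluates four locks. For a forward move, the locks are: L1: $A_{prev,r} \stackrel{?}{=} P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m} \oplus V_i$, L2: $A_{i,j} \stackrel{?}{=} P_{MV,0} \oplus P_{MV,1} \oplus \cdots \oplus P_{MV,2m} \oplus V_{MV}$, L3: $A_i \stackrel{?}{=} P_{prev,r}$ and L4: $A_{MV} \stackrel{?}{=} P_{i,j}$. For a backward move, the locks are: L1: $A_{next,u} \stackrel{?}{=} P_{MV,0} \oplus P_{MV,1} \oplus \cdots \oplus P_{MV,2m} \oplus V_{MV}$, L2: $A_{MV,t} \stackrel{?}{=} P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m} \oplus V_i$, L3: $A_{MV} \stackrel{?}{=} P_{next,u}$ and L4: $A_i \stackrel{?}{=} P_{MV,t}$. Here $P_{prev,r}$ is the pointer from $N_{prev}$ to $N_i$, $P_{next,u}$ is the pointer from $N_{next}$ to $N_{MV}$, and $P_{MV,t}$ is the pointer from $N_{MV}$ to $N_i$. (In the $W_2$ Checking Window, only two locks are evaluated, namely $A_{i,j} \stackrel{?}{=} P_{MV,0} \oplus P_{MV,1} \oplus \cdots \oplus P_{MV,2m} \oplus V_{MV}$ and $A_{MV} \stackrel{?}{=} P_{i,j}$ for the forward move, and $A_{MV,t} \stackrel{?}{=} P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m} \oplus V_i$ and $A_i \stackrel{?}{=} P_{MV,t}$ for the backward move.)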


Table 4. Analysis of Errors in the VBT.

Error Condition                                            max($d_2$)  max($d_3$)  max($d_c$), $\forall c \geq 4$
Non-empty VBT becomes empty                                2m+1        2m+1        2m+1
Empty VBT becomes non-empty                                2m+2        2m+2        2m+2
Key, X or Y becomes erroneous                              3           3           3
Internal node's non-null pointer points to incorrect node  2           3           3
Internal node's non-null pointer becomes null              6           6           6
Internal node's null pointer becomes non-null              6           7           7
Two of internal node's pointers exchanged                  2           4           4
Internal node becomes a leaf node                          3           3           3
Leaf node becomes an internal node                         3           4           5

Theorem 6: Any single pointer error detected by a forward move in $W_3 = \{N_{prev}, N_i, N_{MV}\}$ can be corrected with at most 2m+1 extra node accesses in O(1) time. Any single pointer error detected by a backward move in $W_3 = \{N_i, N_{MV}, N_{next}\}$ can be corrected in $O(\log_m n)$ time.

Proof: Since the local concurrent error detectability of this structure in $W_3$ is $D_3 = 2$, the upper limit of correctability is 1. Assume that the error detected is a single error. The error may be a key, a key check symbol, a count or a pointer. For the key or key check symbol error, diagnosis and correction are performed using the procedures for product codes [17]. For a count error, all the keys and key check symbols will be correct, hence counting the non-null keys will regenerate the count.




Table 5a. Detection and Diagnosis Locks for Forward Moves in the VBT using $W_3$.

Detection Locks
  L1   $A_{prev,r} \stackrel{?}{=} P_{i,0} \oplus \cdots \oplus P_{i,2m} \oplus V_i$
  L2   $A_{i,j} \stackrel{?}{=} P_{MV,0} \oplus \cdots \oplus P_{MV,2m} \oplus V_{MV}$
  L3   $A_i \stackrel{?}{=} P_{prev,r}$
  L4   $A_{MV} \stackrel{?}{=} P_{i,j}$

Diagnosis Locks
  L5   $\bigcap_{t=0}^{2m} (A_{MV,t} \stackrel{?}{=} P_{x,0} \oplus \cdots \oplus P_{x,2m} \oplus V_x)$     Access $N_x$ via $P_{MV,t}$ for $0 \leq t \leq 2m$
  L6   Access $N_y$ via $P_{i,0} \oplus \cdots \oplus P_{i,2m} \oplus V_i$
  L7   $\bigcap_{s \neq j} (A_{i,s} \stackrel{?}{=} P_{z,0} \oplus \cdots \oplus P_{z,2m} \oplus V_z)$     Access $N_z$ via $P_{i,s}$ for $0 \leq s \leq 2m$ and $s \neq j$

Table 5b. Error Detection and Diagnosis Syndromes for Errors Detected by Forward Moves in the VBT using $W_3$.

For the pointer error, if the erroneous pointer is located at the header node, it can be corrected by simple comparison because there are $2m+1 \geq 3$ identical pointers in the header. Otherwise, there are two cases: detection by a forward move and detection by a backward move. Assume that the error has been detected during the forward move from $N_i$ to $N_{MV}$ following $P_{i,j}$. The LCED procedure supplies the values of the four detection locks (Table 5a), and three error indication values generated by a node access routine, $NA_{prev}$, $NA_i$, $NA_{MV}$, that indicate out-of-bounds pointers or pointers that do not point to logical pointer boundaries, when used to access $N_{prev}$, $N_i$ and $N_{MV}$, respectively. There are nine possible errors: 1) $A_{prev}$ error, 2) $P_{prev,r}$ error, where $P_{prev,r}$ is the pointer from $N_{prev}$ to $N_i$, 3) $A_i$ error, 4) $P_{i,j}$ error, 5) $P_{i,s}$ error for $0 \leq s \leq 2m$ and $s \neq j$, 6) $V_i$ error, 7) $A_{MV}$ error, 8) $P_{MV,t}$ error for $0 \leq t \leq 2m$, and 9) $V_{MV}$ error. To distinguish the nine errors, the seven-tuple syndrome {L1, L2, L3, L4, $NA_{prev}$, $NA_i$, $NA_{MV}$} is constructed (Table 5b). For the error-free case, the syndrome will be {True, True, True, True, True, True, True}. There are two cases of identical syndromes for different errors. In each case extra nodes are accessed to completely diagnose the error. The nodes $N_x$ are accessed by following all the pointers $P_{MV,t}$ from $N_{MV}$ to distinguish a $P_{MV,t}$ error from a $V_{MV}$ error. $N_y$ is accessed by following $P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m} \oplus V_i$ to distinguish an $A_{prev}$ error from a $V_i$ error or a $P_{i,s}$ error. The latter two errors are distinguished by accessing the nodes by following all the pointers $P_{i,s}$ from $N_i$. Once the error has been diagnosed, correction proceeds as follows:

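1) $A_{prev}$ error: compute $A_{prev,r}$ from $P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m} \oplus V_i$, from which $A_{prev}$ can be calculated.

2) $P_{prev,r}$ error: correct value is $A_i$.

3) $A_i$ error: correct value is $P_{prev,r}$.

4) $P_{i,j}$ error: correct value is $A_{MV}$.

5) $P_{i,s}$ error: correct value is $A_{prev,r} \oplus P_{i,0} \oplus \cdots \oplus P_{i,s-1} \oplus P_{i,s+1} \oplus \cdots \oplus P_{i,2m} \oplus V_i$.

6) $V_i$ error: correct value is $A_{prev,r} \oplus P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m}$.

7) $A_{MV}$ error: correct value is $P_{i,j}$.

8) $P_{MV,t}$ error: correct value is $A_{i,j} \oplus P_{MV,0} \oplus \cdots \oplus P_{MV,t-1} \oplus P_{MV,t+1} \oplus \cdots \oplus P_{MV,2m} \oplus V_{MV}$.

9) $V_{MV}$ error: correct value is $A_{i,j} \oplus P_{MV,0} \oplus P_{MV,1} \oplus \cdots \oplus P_{MV,2m}$.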

Assume now that the error has been detected during a backward move from $N_i$ to $N_{MV}$ following $P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m} \oplus V_i$. The LCED procedure supplies the values of the four detection locks (Table 6a), and three error indication values generated by a node access routine, $NA_{next}$, $NA_{MV}$, $NA_i$, that indicate out-of-bounds pointers or pointers that do not point to logical pointer boundaries, when used to access $N_{next}$, $N_{MV}$ and $N_i$, respectively. There are eight possible errors: 1) $A_{next}$ error, 2) $P_{next,u}$ error, where $P_{next,u}$ is the pointer from $N_{next}$ to $N_{MV}$, 3) $A_{MV}$ error, 4) $P_{MV,t}$ error for $0 \leq t \leq 2m$, 5) $V_{MV}$ error, 6) $A_i$ error, 7) $P_{i,j}$ error for $0 \leq j \leq 2m$, and 8) $V_i$ error. To distinguish the eight errors, the seven-tuple syndrome {L1, L2, L3, L4, $NA_{next}$, $NA_{MV}$, $NA_i$} is constructed (Table 6b). For the error-free case, the syndrome will be {True, True, True, True, True, True, True}. There are two cases of identical syndromes for different errors. In each case extra nodes are accessed to completely diagnose the error. The nodes $N_x$ are accessed by following all the pointers $P_{i,j}$ from $N_i$ to distinguish a $P_{i,j}$ error from a $V_i$ error. $N_y$ is accessed by following $P_{next,0} \oplus P_{next,1} \oplus \cdots \oplus P_{next,2m} \oplus V_{next}$ to distinguish a $P_{next,u}$ error from a $V_{MV}$ error or a $P_{MV,t}$ error. The latter two errors are distinguished by accessing the nodes by following all the pointers $P_{MV,t}$ from $N_{MV}$. Once the error has been diagnosed, correction proceeds as follows:

1) $A_{next}$ error: compute $A_{next,u}$ from $P_{MV,0} \oplus P_{MV,1} \oplus \cdots \oplus P_{MV,2m} \oplus V_{MV}$, from which $A_{next}$ can be calculated.

2) $P_{next,u}$ error: correct value is $A_{MV}$.

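Table 6a. Detection and Diagnosis Locks for Backward Moves in the VBT using $W_3$.

Detection Locks
  L1   $A_{next,u} \stackrel{?}{=} P_{MV,0} \oplus \cdots \oplus P_{MV,2m} \oplus V_{MV}$
  L2   $A_{MV,t} \stackrel{?}{=} P_{i,0} \oplus \cdots \oplus P_{i,2m} \oplus V_i$
  L3   $A_{MV} \stackrel{?}{=} P_{next,u}$
  L4   $A_i \stackrel{?}{=} P_{MV,t}$

Diagnosis Locks
  L5   $\bigcap_{j=0}^{2m} (A_{i,j} \stackrel{?}{=} P_{x,0} \oplus \cdots \oplus P_{x,2m} \oplus V_x)$     Access $N_x$ via $P_{i,j}$ for $0 \leq j \leq 2m$
  L6   Access $N_y$ via $P_{next,0} \oplus \cdots \oplus P_{next,2m} \oplus V_{next}$
  L7   $\bigcap_{t=0}^{2m} (A_{MV,t} \stackrel{?}{=} P_{z,0} \oplus \cdots \oplus P_{z,2m} \oplus V_z)$     Access $N_z$ via $P_{MV,t}$ for $0 \leq t \leq 2m$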


Table 6b. Error Detection and Diagnosis Syndromes for Errors Detected by Backward Moves in the VBT using $W_3$.

error          L1  L2  L3  L4  $NA_{next}$  $NA_{MV}$  $NA_i$  L5  L6  L7
$A_{next}$     F   T   T   T   T            T          T       -   -   -
$P_{next,u}$   T   T   F   T   T            T          T       -   F   -
$A_{MV}$       T   F   F   T   T            T          T       -   -   -
$P_{MV,t}$     (entries illegible in the source)
$V_{MV}$       (entries illegible in the source)
$A_i$          T   T   T   F   T            T          T       -   -   -
$P_{i,j}$      T   F   T   T   T            T          T       F   -   -
$V_i$          T   F   T   T   T            T          T       T   -   -

3) $A_{MV}$ error: correct value is $P_{next,u}$.

4) $P_{MV,t}$ error: To correct the error in $P_{MV,t}$, first access the headers of the structure. Next, move forward, accessing nodes $N_0, N_1, \ldots, N_k$, performing $W_3$ LCED and correcting single errors with O(1) LCEC, until $P_{k,s} = A_{MV}$. Then the correct value of $P_{MV,t}$ is $A_{k,s} \oplus P_{MV,0} \oplus \cdots \oplus P_{MV,t-1} \oplus P_{MV,t+1} \oplus \cdots \oplus P_{MV,2m} \oplus V_{MV}$.

5) $V_{MV}$ error: To correct the error in $V_{MV}$, first access the headers of the structure. Next, move forward, accessing nodes $N_0, N_1, \ldots, N_k$, performing $W_3$ LCED and correcting single errors with O(1) LCEC, until $P_{k,s} = A_{MV}$. Then the correct value of $V_{MV}$ is $A_{k,s} \oplus P_{MV,0} \oplus P_{MV,1} \oplus \cdots \oplus P_{MV,2m}$.

6) $A_i$ error: correct value is $P_{MV,t}$.

7) $P_{i,j}$ error: correct value is $A_{MV,t} \oplus P_{i,0} \oplus \cdots \oplus P_{i,j-1} \oplus P_{i,j+1} \oplus \cdots \oplus P_{i,2m} \oplus V_i$.

8) $V_i$ error: correct value is $A_{MV,t} \oplus P_{i,0} \oplus P_{i,1} \oplus \cdots \oplus P_{i,2m}$. □

The robust B-tree [3] presented by Black, Taylor and Morgan performs double error detection or single error correction in O(n) time, and requires 2m+3 extra fields in each node of an order-m B-tree. Taylor and Black have also developed the LB-Tree [10], which is locally correctable, in that it can correct many single errors if they occur in separate substructures. However, in order to verify a pointer, one level of nodes must be traversed, and to correct a pointer, all the levels above the current level must be traversed. Hence, double error detection and single error correction require O(n) time, and 2m+5 extra fields in each node of an order-m B-tree are required. In comparison, the advantages of the VBT are as follows:

1) Double pointer errors can be detected in the VBT using an O(1) LCED procedure.

2) Single pointer errors can be corrected in the VBT using an O(1) LCEC procedure for an error detected during a forward move, or using an $O(\log_m n)$ LCEC procedure for an error detected during a backward move.

3) The VBT requires only m+4 extra fields in each node.

4) The virtual backpointer facilitates backward traversals of the VBT, which can then be used to enhance performance.


IV. ANALYSIS AND IMPLEMENTATION OF A CONCURRENT AUDITOR PROCESS


The  Concurrent  Auditor  Process  (CAP)  is  an  on-line  process  for  error  detection  and  correction 
that  runs  in  parallel  with  user  processes  accessing  a  database.  It  is  used,  in  this  case,  to  perform 
data  structure  error  detection  and  correction  for  the  user  processes,  and  allows  concurrent  access  to 
structures  being  checked  to  reduce  the  system  performance  degradation  due  to  error  detection. 
Koved and Waldbaum have developed an auditor program that provides detection of computer subsystem failures [18], based on Waldbaum's concept of the auditor program [19]. Taylor, Morgan and Black have suggested the use of an audit program to periodically perform error detection and correction in data structures [1]. However, little analysis has been performed on the effectiveness of such an audit program. This section presents an analysis of the effectiveness of the CAP and presents measurements of the CAP's effectiveness in a Sequent Balance 8000 multiprocessor implementation using a database of VDLL.

The  CAP  described  here  accesses  structures  more  frequently  and  uniformly  than  user 
processes  to  reduce  the  latency  of  error  detection.  Also,  the  CAP  performs  error  detection  in 
Checking  Windows  of  higher  cost  than  those  used  by  user  processes,  to  reduce  their  performance 
degradation.  For  example,  if  the  database  is  composed  of  VDLL  or  VBT  instances,  user  processes 
may  perform  single  pointer  error  detection  in  W2  with  less  computation  cost,  while  relying  on  the 
CAP  to  detect  the  less-frequent  double  pointer  errors  in  W3  with  more  computation  cost.  The 
effectiveness of the CAP is determined by its increase of the mean time to failure (MTTF) of the system. Ideally, a large increase is achieved with little degradation of user process performance. Hence, the CAP permits user processes to access structures being checked as long as they do not insert or delete nodes from the CAP's current Checking Window. Expressions are derived to determine the MTTF in a multi-user, n-process system with and without the use of the CAP. This is followed by the results of an implementation of the CAP using a VDLL database.


A.  Analysis 

In a multi-user, n-process shared-database environment, assume that the CAP performs error detection in $W_3$ and that user processes perform error detection in $W_2$. The pointer errors can then be divided into three classes: $E_0$, $E_1$ and $E_2$. $E_0$ errors are those which can be detected by a user process or by the CAP. $E_1$ errors can be detected by the CAP but not by a user process. $E_2$ errors can be detected by neither a user process nor the CAP. Suppose the time for an $E_i$ error to occur is $T_{E_i}$, the time for a user process to encounter that error is $T_U$, and the time for the CAP to detect an $E_1$ error is $T_A$. For the purposes of analysis assume that, in a given time interval, both the number of errors that occur and the number of accesses to a particular node are random variables following a Poisson distribution. Then the random variables $T_{E_i}$, $T_U$ and $T_A$ follow an exponential distribution with mean time $\gamma_i^n$, $\beta$ and $\alpha$, respectively.

Lemma 1: The probability of an $E_1$ error causing any of the n processes to fail in the presence of the CAP is $1 - \left[1 - \frac{\alpha}{\alpha+\beta}\right]^n$.

Proof: For a single process, the probability to fail can be derived using basic probability theory:

$$\mathrm{Prob}(T_A > T_U) = \int_0^\infty \mathrm{Prob}(T_U = x)\,\mathrm{Prob}(T_A > x)\,dx = \int_0^\infty \frac{1}{\beta} e^{-x/\beta} e^{-x/\alpha}\,dx = \frac{\alpha}{\alpha+\beta}.$$

Therefore, the probability of any of the n processes failing is $1 - \left[1 - \frac{\alpha}{\alpha+\beta}\right]^n$.


Theorem 7: Without the use of the CAP, MTTF = $\gamma_1^n + \beta$, and with the use of the CAP,

$$MTTF_{CAP} = \min\left\{ \frac{\gamma_1^n}{1 - \left[\frac{\beta}{\alpha+\beta}\right]^{n'}},\ \gamma_2^n \right\} + \beta.$$

Proof: If no CAP is used, MTTF = $\min(E(T_{E_1}), E(T_{E_2})) + E(T_U) = \min(\gamma_1^n, \gamma_2^n) + \beta = \gamma_1^n + \beta$, where E(X) is the expected value of random variable X.

In the presence of the CAP, the determination of whether an $E_1$ error will cause a failure can be modeled as a Bernoulli trial with parameter $p = 1 - \left[\frac{\beta}{\alpha+\beta}\right]^{n'}$. Hence the $MTTF_{CAP}$ follows a geometric distribution with mean $\frac{1}{p}$, where $n'$ represents the effect of n user processes and the ...

If $E_1$ and $E_2$ errors are formed by the accumulation of $E_0$ errors, then $T_{E_1}$ and $T_{E_2}$ are proportional to the access frequency. Thus $\gamma_1^n = n\gamma_1^1$, $\gamma_2^n = n\gamma_2^1$ and $\gamma_i^1 = \gamma_i$. This gives, for the without-CAP case, MTTF = $\gamma_1^n + \beta = n\gamma_1 + \beta$. In the with-CAP case, since the CAP is $\frac{\beta}{\alpha}$ times faster in checking the data structure than a user process, $\gamma_1^n = \left(n + \frac{\beta}{\alpha}\right)\gamma_1$. $E_2$ errors will retain an exponential distribution but with different mean $\gamma_2^n = \left(n + \frac{\beta}{\alpha}\right)\gamma_2$. For this case the theorem gives

$$MTTF_{CAP} = \min\left\{ \frac{\left(n + \frac{\beta}{\alpha}\right)\gamma_1}{1 - \left[\frac{\beta}{\alpha+\beta}\right]^{n'}},\ \left(n + \frac{\beta}{\alpha}\right)\gamma_2 \right\} + \beta.$$

Example 4: Suppose $\gamma_1$ = 100 hours, $\gamma_2$ = 10,000 hours, $\beta$ = 1 minute, and 5 user processes are active on the system. Without the use of the CAP, MTTF $\approx$ 500 hours. However, by using the CAP, and with $\alpha$ = 10 seconds, MTTF is increased to $MTTF_{CAP} \approx 2050$ hours. □
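The figures in Example 4 can be checked numerically. The sketch below is an illustration under explicit assumptions, not a formula quoted from the paper: it writes $\beta$ for the mean time for a user process to encounter an error and $\alpha$ for the mean CAP detection time, takes the with-CAP mean times as $(n + \beta/\alpha)\gamma_1$ and $(n + \beta/\alpha)\gamma_2$, divides the first by the Bernoulli parameter $p = 1 - [\beta/(\alpha+\beta)]^{n'}$, and assumes $n' = n$. All times are in hours, and the function names are hypothetical.

```python
def mttf_without_cap(n, gamma1, beta):
    """MTTF = n * gamma_1 + beta when no CAP is running."""
    return n * gamma1 + beta


def mttf_with_cap(n, gamma1, gamma2, alpha, beta, n_prime=None):
    """MTTF_CAP under the assumptions stated above (n' defaults to n)."""
    n_prime = n if n_prime is None else n_prime
    scale = n + beta / alpha                    # effect of the CAP's faster checking
    p = 1 - (beta / (alpha + beta)) ** n_prime  # chance an E_1 error causes a failure
    return min(scale * gamma1 / p, scale * gamma2) + beta


beta = 1 / 60        # 1 minute, in hours
alpha = 10 / 3600    # 10 seconds, in hours

base = mttf_without_cap(n=5, gamma1=100, beta=beta)
with_cap = mttf_with_cap(n=5, gamma1=100, gamma2=10_000, alpha=alpha, beta=beta)
```

With these inputs `base` is about 500 hours and `with_cap` is about 2047 hours, which agrees with the rounded values quoted in the example when $n' = n$ is assumed.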


If $\alpha$ is small enough (i.e., the CAP is fast enough), the first term of the minimum can exceed the $\left(n + \frac{\beta}{\alpha}\right)\gamma_2$ term. In this case, $MTTF_{CAP} = \left(n + \frac{\beta}{\alpha}\right)\gamma_2 + \beta$. This effectively eliminates the chances of a user process failure due to $E_1$ errors, which occur more often than $E_2$ errors.


B. Implementation

A model database of VDLL was implemented in C and run on a Sequent Balance 8000 shared-memory multiprocessor system with six CPUs. Single random errors and worst-case double errors (called "double cooperative errors," where a second error masks a previous error) were injected into the database one at a time. Error detection was accomplished by one of four user processes, the database manager, or the CAP, each of which performed either $W_2$ or $W_3$ checking. The database manager serviced all update requests, and the CAP operated in the idle time of the database manager, to reduce performance degradation. Databases of 50, 100, 500 and 1000 nodes were used in the simulations. Each database consisted of eight VDLL instances: six non-empty instances, one empty instance, and a free list. To model the locality of user process database access, each user process performed approximately 80% of its operations (composed of 75% searches, 12.5% insertions and 12.5% deletions) within one VDLL, and the other 20% in a randomly selected VDLL.

For each single or double error injected, the detection latency and the number of operations completed in that time were measured, for five different combinations of user process LCED/CAP LCED (Table 7). The mean error detection latencies for the five combinations, applied to databases of 50, 100, 500 and 1000 nodes, are shown in Table 8. Table 9 shows by what factor use of the CAP can decrease the error detection latency. The following observations can be made based on the results of the implementation:

1) Single and double LCED can be performed on the VDLL in O(1) time.

2)  The  use  of  the  CAP  significantly  reduces  the  error  detection  latency  of  both 
single  random  errors  and  double  cooperative  errors. 

3)  The  CAP  is  more  effective  in  reducing  the  detection  latency  of  single  random 
errors  as  the  size  of  the  database  increases. 

Using the analysis results of the previous section, the first observation shows that $\gamma_1^n \geq 5\gamma_1$. Thus from Theorem 7, $MTTF_{CAP} > 5 \times MTTF$. This clearly shows the utility of the CAP in
increasing the MTTF of the system.

V. SUMMARY

In this paper, we have presented a new technique for local concurrent error detection in linked data structures that can achieve O(1) error detection in a variety of data structures. This technique uses the concept of a Checking Window to define the locality in which local concurrent error detection is performed and also to determine the associated cost of the locality. The virtual backpointer was introduced and used to define two new data structures, the Virtual Double-Linked List, which incurs no storage overhead, and the B-Tree with Virtual Backpointers of order m, which requires m+4 extra fields per node. It was shown that double errors could be detected using a local concurrent error detection procedure in O(1) time for both structures. In addition, those errors detected during forward moves were shown to be correctable using a local concurrent error correction procedure in O(1) time. Correction of those errors detected during backward moves was shown to be, in the worst case, O(n). Finally, an analysis and implementation of a concurrent auditor process in a shared database using the virtual backpointer technique was shown to significantly reduce the error detection latency.

Table 8. Mean Error Detection Latencies.

Error type                 Database size   Number of samples
Single random error        50              10000
Single random error        100             10000
Single random error        500             1800
Single random error        1000            200
Double cooperative error   50              10000
Double cooperative error   100             10000
Double cooperative error   500             1800

Legible latency entries (column assignment not recoverable): 39, 7, 57, 11, 447, 50.

Table 9. Detection Latency Reduction Factor Through Use of the CAP.
(Rows: single random error at database sizes 50, 100, 500 and 1000; double cooperative error at database sizes 50, 100 and 500. Columns: pairs of cases compared, including 1:2 and 1:3.)